
Unsupervised Learning of Dense Visual Representations

Neural Information Processing Systems

Contrastive self-supervised learning has emerged as a promising approach to unsupervised visual representation learning. In general, these methods learn global (image-level) representations that are invariant to different views (i.e., compositions of data augmentation) of the same image. However, many visual understanding tasks require dense (pixel-level) representations. In this paper, we propose View-Agnostic Dense Representation (VADeR) for unsupervised learning of dense representations. VADeR learns pixelwise representations by forcing local features to remain constant over different viewing conditions.
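The abstract describes a pixel-level contrastive objective: features at corresponding pixel locations across two views of the same image are pulled together, while features at other locations act as negatives. A minimal sketch of such a pixelwise InfoNCE loss is below; the function name, feature layout, and temperature value are illustrative assumptions, not details from the paper.

```python
import math

def dot(u, v):
    # Inner product of two feature vectors.
    return sum(a * b for a, b in zip(u, v))

def pixel_infonce(feats_a, feats_b, temperature=0.07):
    """Pixel-level InfoNCE sketch (illustrative, not the paper's code).

    feats_a[i] and feats_b[i] are unit-norm features of the SAME pixel
    under two different views; every other pixel in view B serves as a
    negative for pixel i.
    """
    loss = 0.0
    for i, f in enumerate(feats_a):
        # Similarity of pixel i in view A to every pixel in view B.
        logits = [dot(f, g) / temperature for g in feats_b]
        # Numerically stable log-sum-exp over all candidates.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        # Negative log-probability assigned to the matching pixel.
        loss += log_denom - logits[i]
    return loss / len(feats_a)
```

With matching features the loss is near zero; swapping the correspondence (so each pixel's "positive" is actually a different pixel's feature) drives it up, which is the invariance-across-views behavior the abstract describes.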


Review for NeurIPS paper: Unsupervised Learning of Dense Visual Representations

Neural Information Processing Systems

A key limitation of this work is that the proposed network, VADeR, is always initialized with MoCo self-supervised pre-training. While this is benign for practical purposes, it conflates the two methods and also means that VADeR is effectively trained for longer. Training a randomly initialized network with the proposed method would provide crucial empirical evidence, and would only strengthen, not weaken, the experiments and claims.
